Smart Paper Notes

Gait Recognition Based on Deep Learning: A Survey

Claudio Filipi Gonçalves dos Santos , Diego de Souza Oliveira , Leandro A. Passos , Rafael Gonçalves Pires , Daniel Felipe Silva Santos , Lucas Pascotti Valem , Thierry P. Moreira , Marcos Cleison S. Santana , Mateus Roder , João Paulo Papa

Categories: Computer Vision | Machine Learning

2022-01-10

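In general, biometry-based control systems cannot rely on the expected behavior or cooperation of individuals in order to operate properly. Instead, such systems should be aware of malicious procedures used in unauthorized access attempts. Some works in the literature suggest addressing the problem through gait recognition approaches. These methods aim to identify human beings through intrinsic perceptible features, regardless of the clothes or accessories being worn. Although the issue has been a challenge for a relatively long time, most techniques developed to handle it present several drawbacks related to feature extraction and low classification rates, among other issues. However, deep learning-based approaches have recently emerged as a robust set of tools for virtually any image and computer vision problem, and they provide the most significant results for gait recognition as well. Therefore, this work provides a surveyed compilation of recent works on biometric detection through gait recognition, with a focus on deep learning approaches, emphasizing their benefits and exposing their weaknesses. It also presents categorized and characterized descriptions of the datasets, approaches, and architectures employed to tackle the associated constraints.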


Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments

Abhiram Iyer , Karan Grewal , Akash Velu , Lucas Oliveira Souza , Jeremy Forest , Subutai Ahmad

Categories: Neural and Evolutionary Computing | Artificial Intelligence | Machine Learning

2021-12-31

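A key challenge for AI is to build embodied systems that operate in dynamically changing environments. Such systems must adapt to changing task contexts and learn continuously. Although standard deep learning systems achieve state-of-the-art results on static benchmarks, they often struggle in dynamic scenarios. In these settings, error signals from multiple contexts can interfere with one another, ultimately leading to a phenomenon known as catastrophic forgetting. In this article we investigate biologically inspired architectures as solutions to these problems. Specifically, we show that the biophysical properties of dendrites and local inhibitory systems enable networks to dynamically restrict and route information in a context-specific manner. Our key contributions are as follows. First, we propose a novel artificial neural network architecture that incorporates active dendrites and sparse representations into the standard deep learning framework. Next, we study the performance of this architecture on two separate benchmarks requiring task-based adaptation: Meta-World, a multi-task reinforcement learning environment in which a robotic agent must learn to solve a variety of manipulation tasks simultaneously; and a continual learning benchmark in which the model's prediction task changes throughout training. Analysis on both benchmarks demonstrates the emergence of overlapping but distinct and sparse subnetworks, allowing the system to learn fluidly with minimal forgetting. Our neural implementation marks the first time a single architecture has achieved competitive results in both multi-task and continual learning settings. Our research sheds light on how the biological properties of neurons can inform deep learning systems to address dynamic scenarios that are typically impossible for traditional ANNs to solve.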


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.


Embedding generation for text classification of Brazilian Portuguese user reviews: from bag-of-words to transformers

Frederico Dias Souza , João Baptista de Oliveira e Souza Filho

Categories: Natural Language Processing | Artificial Intelligence

2022-12-01

Text classification is a natural language processing (NLP) task relevant to many commercial applications, such as e-commerce and customer service. Classifying such texts accurately is often challenging because of intrinsic language aspects such as irony and nuance. To accomplish this task, one must provide a robust numerical representation for documents, a process known as embedding. Embedding is a key NLP field today and has advanced significantly in the last decade, especially after the introduction of the word-to-vector concept and the popularization of Deep Learning models for solving NLP tasks, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer-based Language Models (TLMs). Despite the impressive achievements in this field, the literature on generating embeddings for Brazilian Portuguese texts is scarce, especially when considering commercial user reviews. Therefore, this work aims to provide a comprehensive experimental study of embedding approaches targeting binary sentiment classification of user reviews in Brazilian Portuguese. The study covers NLP models ranging from classical (Bag-of-Words) to state-of-the-art (Transformer-based). The methods are evaluated on five open-source databases with pre-defined data partitions, made available in an open digital repository to encourage reproducibility. Fine-tuned TLMs achieved the best results in all cases, followed by the feature-based TLM, LSTM, and CNN, whose ranks alternate depending on the database under analysis.
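
As a rough illustration of the classical end of the range mentioned in this abstract, the sketch below builds bag-of-words count vectors in plain Python. The whitespace tokenizer, the function names, and the two example reviews are assumptions made for illustration; they are not taken from the paper, which may use different preprocessing and weighting.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted for determinism)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def bag_of_words(document, vocabulary):
    """Return the term-count vector of one document; unknown tokens are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector

# Made-up example reviews (hypothetical, not from the paper's databases).
reviews = ["produto muito bom", "produto ruim muito ruim"]
vocab = build_vocabulary(reviews)
vectors = [bag_of_words(r, vocab) for r in reviews]
# vocab orders the columns as: bom, muito, produto, ruim
# vectors == [[1, 1, 1, 0], [0, 1, 1, 2]]
```

Such vectors can be fed to any standard classifier; they discard word order, which is one reason contextual models tend to do better on ironic or nuanced reviews.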


Debiasing Methods for Fairer Neural Models in Vision and Language Research: A Survey

Otávio Parraga , Martin D. More , Christian M. Oliveira , Nathan S. Gavenski , Lucas S. Kupssinskü , Adilson Medronha , Luis V. Moura , Gabriel S. Simões , Rodrigo C. Barros

Categories: Machine Learning | Artificial Intelligence | Natural Language Processing | Computer Vision

2022-11-10

Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their current shortcomings. One of them is that neural networks are correlation machines prone to modeling biases within the data instead of focusing on actual useful causal relationships. This problem is particularly serious in application domains affected by aspects such as race, gender, and age. To prevent models from engaging in unfair decision-making, the AI community has concentrated efforts on correcting algorithmic biases, giving rise to the research area now widely known as fairness in AI. In this survey paper, we provide an in-depth overview of the main debiasing methods for fairness-aware neural networks in the context of vision and language research. We propose a novel taxonomy to better organize the literature on debiasing methods for fairness, and we discuss the current challenges, trends, and important future work directions for the interested researcher and practitioner.


Large-Margin Representation Learning for Texture Classification

Jonathan de Matos , Luiz Eduardo Soares de Oliveira , Alceu de Souza Britto Junior , Alessandro Lameiras Koerich

Categories: Computer Vision | Machine Learning

2022-06-17

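This paper presents a novel approach that combines convolutional layers (CLs) and large-margin metric learning for training models on small datasets for texture classification. The core of the approach is a loss function that computes the distances between instances of interest and support vectors. The objective is to update the weights of the CLs iteratively to learn a representation with a large margin between classes. Each iteration results in a large-margin discriminant model, represented by support vectors, based on that representation. The advantage of the proposed approach w.r.t. convolutional neural networks (CNNs) is two-fold. First, it allows representation learning with a small amount of data, owing to the reduced number of parameters compared with an equivalent CNN. Second, it has a low training cost, since backpropagation considers only the support vectors. Experimental results on texture and histopathologic image datasets show that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence than equivalent CNNs.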


Shallow decision trees for explainable $k$-means clustering

Eduardo Laber , Lucas Murtinho , Felipe Oliveira

Categories: Machine Learning

2021-12-29

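A number of recent works have employed decision trees to build explainable partitions that aim to minimize the $k$-means cost function. These works, however, largely ignore metrics related to the depths of the leaves in the resulting tree, which is perhaps surprising considering how much the explainability of a decision tree depends on these depths. To fill this gap in the literature, we propose an efficient algorithm that takes these metrics into account. In experiments on 7 datasets, our algorithm yields better results than decision-tree clustering algorithms such as those presented in \cite{dasgupta2020explainable}, \cite{frost2020exkmc}, \cite{laber2021price} and \cite{DBLP:conf/icml/MakarychevS21}, typically achieving lower or equivalent costs with considerably shallower trees. We also show, through a simple adaptation of existing techniques, that the problem of building explainable partitions induced by binary trees for the $k$-means cost function does not admit a $(1+\epsilon)$-approximation in polynomial time unless $P = NP$, which justifies the quest for approximation algorithms and/or heuristics.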
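
To make the two quantities in this abstract concrete, the sketch below evaluates a partition induced by an axis-aligned threshold tree: it computes the $k$-means cost (sum of squared distances to each cluster mean) and the depths of the leaves that points reach. The tuple encoding of the tree, the function names, and the choice of maximum and point-weighted average depth are illustrative assumptions; the paper's own depth metrics and algorithm are not reproduced here.

```python
from collections import defaultdict

# Hypothetical tree encoding: ("leaf", cluster_id) or
# ("split", feature_index, threshold, left_subtree, right_subtree).

def route(tree, x, depth=0):
    """Return (cluster_id, leaf_depth) for point x."""
    if tree[0] == "leaf":
        return tree[1], depth
    _, feature, threshold, left, right = tree
    child = left if x[feature] <= threshold else right
    return route(child, x, depth + 1)

def kmeans_cost(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    clusters = defaultdict(list)
    for p, c in zip(points, labels):
        clusters[c].append(p)
    cost = 0.0
    for members in clusters.values():
        dim = len(members[0])
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        cost += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return cost

def evaluate(tree, points):
    """Return (k-means cost, maximum leaf depth, point-weighted average depth)."""
    routed = [route(tree, p) for p in points]
    labels = [c for c, _ in routed]
    depths = [d for _, d in routed]
    return kmeans_cost(points, labels), max(depths), sum(depths) / len(depths)

# Toy data: two well-separated pairs, split once on feature 0.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
tree = ("split", 0, 5.0, ("leaf", 0), ("leaf", 1))
cost, max_depth, avg_depth = evaluate(tree, points)
# cost == 4.0, max_depth == 1, avg_depth == 1.0
```

A deeper tree can never increase the achievable cost but makes each cluster's explanation longer, which is the trade-off the abstract highlights.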


Unsupervised machine learning approaches to the $q$-state Potts model

Andrea Tirelli , Danyella O. Carvalho , Lucas A. Oliveira , J. P. Lima , Natanael C. Costa , Raimundo R. dos Santos

Categories: Machine Learning

2021-12-13

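In this paper we study phase transitions of the $q$-state Potts model through a number of unsupervised machine learning techniques, namely Principal Component Analysis (PCA), $k$-means clustering, Uniform Manifold Approximation and Projection (UMAP), and Topological Data Analysis (TDA). Even though in all cases we are able to retrieve the correct critical temperatures $T_c(q)$ for $q = 3, 4$ and $5$, the results show that non-linear methods such as UMAP and TDA are less dependent on finite-size effects, while still being able to distinguish between first- and second-order phase transitions. This study may be considered a benchmark for the use of different unsupervised machine learning algorithms in the investigation of phase transitions.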


Computational simulation and the search for a quantitative description of simple reinforcement schedules

Paulo Sergio Panse Silveira , José de Oliveira Siqueira , João Lucas Bernardy , Jessica Santiago , Thiago Cersosimo Meneses , Bianca Sanches Portela , Marcelo Frota Benvenuti

Categories: Artificial Intelligence

2021-11-27

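We aim to discuss schedules of reinforcement in theoretical and practical terms, pointing out practical limitations on implementing those schedules while discussing the advantages of computational simulation. In this paper, we present an R script named Beak, built to simulate rates of behavior interacting with schedules of reinforcement. Using Beak, we have simulated data that allow an assessment of different reinforcement feedback functions (RFF). This was done with unparalleled precision, since simulations provide huge samples of data and, more importantly, the simulated behavior is not changed by the reinforcement it produces. Therefore, we can vary it systematically. We compared different RFFs for RI schedules, using as criteria: meaning, precision, parsimony, and generality. Our results indicate that the best feedback function for the RI schedule was published by Baum (1981). We also propose that the model used by Killeen (1975) is a viable feedback function for the RDRL schedule. We argue that Beak paves the way for a greater understanding of schedules of reinforcement, addressing open questions about the quantitative features of schedules. The results could also guide future experiments that use schedules as theoretical and methodological tools.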
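
The kind of simulation this abstract describes can be sketched in a few lines. The example below is not Beak (which is an R script whose internals are not given here); it is a minimal Python illustration that assumes a discrete-time random-interval (RI) schedule in which a reinforcer is armed with probability `1 / mean_interval` per time step and held until the next response. The function name and parameters are hypothetical. As in the abstract, the response probability is fixed, so the simulated behavior is not changed by the reinforcement it produces.

```python
import random

def simulate_ri(mean_interval, response_prob, steps, seed=0):
    """Simulate a discrete-time RI schedule (illustrative assumption, not Beak).

    Returns (response_rate, reinforcement_rate), both per time step.
    """
    rng = random.Random(seed)
    armed = False            # True while a reinforcer is waiting to be collected
    responses = 0
    reinforcers = 0
    for _ in range(steps):
        # The schedule arms a reinforcer at random, independent of behavior.
        if not armed and rng.random() < 1.0 / mean_interval:
            armed = True
        # Behavior occurs with a fixed probability, unaffected by reinforcement.
        if rng.random() < response_prob:
            responses += 1
            if armed:
                reinforcers += 1
                armed = False
    return responses / steps, reinforcers / steps

# Sweep the response rate to trace an empirical feedback function.
for p in (0.05, 0.2, 0.8):
    b, r = simulate_ri(mean_interval=10, response_prob=p, steps=100_000)
    print(f"response rate {b:.3f} -> reinforcement rate {r:.3f}")
```

Plotting reinforcement rate against response rate from such a sweep gives an empirical curve against which candidate feedback functions can be compared.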


CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

Arnaldo Candido Junior , Edresson Casanova , Anderson Soares , Frederico Santos de Oliveira , Lucas Oliveira , Ricardo Corso Fernandes Junior , Daniel Peixoto Pinto da Silva , Fernando Gorgulho Fayet , Bruno Baldissera Carlotto , Lucas Rafael Stefanel Gris

Categories: Natural Language Processing

2021-10-14

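Automatic Speech Recognition (ASR) is a complex and challenging task. In recent years, there have been significant advances in the area. In particular, for the Brazilian Portuguese (BP) language, about 376 hours of audio were publicly available for ASR tasks as of the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however, are composed of audios containing only read and prepared speech. There is a lack of datasets that include spontaneous speech, which is essential in various ASR applications. This paper presents CORAA (Corpus of Annotated Audios) v1, with 290.77 hours, a publicly available dataset for ASR in BP containing validated pairs (audio-transcription). CORAA also contains European Portuguese audios (4.69 hours). We also present a public ASR model based on Wav2Vec 2.0 XLSR-53 and fine-tuned on CORAA. Our model achieved a Word Error Rate of 24.18% on the CORAA test set and 20.08% on the Common Voice test set. When measuring the Character Error Rate, we obtained 11.02% and 6.34% for CORAA and Common Voice, respectively. The CORAA corpora were assembled both to improve ASR models in BP with phenomena from spontaneous speech and to motivate young researchers to start their studies on ASR for Portuguese. All the corpora are publicly available at https://github.com/nilc-nlp/coraa under the CC BY-NC-ND 4.0 license.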
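
For readers unfamiliar with the two metrics reported in this abstract, the sketch below computes Word Error Rate and Character Error Rate as the Levenshtein (edit) distance between reference and hypothesis, divided by the reference length. The function names and example sentences are assumptions for illustration; the paper's exact text normalization (casing, punctuation, whitespace handling in CER) is not specified here and may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)

print(wer("o gato preto dorme", "o gato dorme"))  # one deleted word out of four -> 0.25
print(cer("casa", "cama"))                        # one substituted character out of four -> 0.25
```

Because insertions count as errors, both rates can exceed 1.0 when the hypothesis is much longer than the reference.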
